Beyond Identity Coreference: Contrasting Indicators of Textual Coherence in English and German
Abstract
This paper focuses on the interaction of chains of coreference identity with other types of relations, comparing English and German data sets in terms of language, mode (written vs. spoken) and register. We first describe the types of coreference and the chain features analysed as indicators of textual coherence and topic continuity. After sketching the feature categories under analysis and the methods used for statistical evaluation, we present the findings from our analysis and interpret them in terms of the contrasts mentioned above. We also show that for some registers, coreference types other than identity are of great importance.
Similar resources
Experiments on bridging across languages and genres
In this paper, we introduce a typology of bridging relations applicable to multiple languages and genres. After discussing our annotation guidelines, we describe annotation experiments on the German part of our parallel coreference corpus and show that our interannotator agreement results are reliable, considering both antecedent selection and relation assignment. In order to validate our theor...
Sociocultural Identity in TEFL Textbooks: A Systemic Functional Analysis
This study aimed at investigating shades of identity in TEFL textbooks. Most identity studies have focused on authors as knowledge producers. They have neglected authors' roles in constructing identity. Further, few scholars have considered disciplinary specific textbooks in their analyses of identity. Trying to bridge these gaps, we applied Halliday’s Systemic Functional Linguistics to investi...
A Tidy Data Model for Natural Language Processing using cleanNLP
The package cleanNLP provides a set of fast tools for converting a textual corpus into a set of normalized tables. The underlying natural language processing pipeline utilizes Stanford’s CoreNLP library, exposing a number of annotation tasks for text written in English, French, German, and Spanish. Annotators include tokenization, part of speech tagging, named entity recognition, entity linking...
Multilingual Coreference Resolution
In this paper we present a new, multilingual data-driven method for coreference resolution as implemented in the SWIZZLE system. The results obtained after training this system on a bilingual corpus of English and Romanian tagged texts outperformed coreference resolution in each of the individual languages. 1 Introduction The recent availability of large bilingual corpora has spawne...
Exploring Authorial Identity in terms of Voice Intensity and Subject-Positioning in the Argumentative Writings of Male and Female Iranian Advanced EFL Learners
Academic writing is not just about presenting a set of ideas, but through the act of writing, the authors position themselves as individuals having particular identities which mostly reflect the dominant sociocultural values and practices of the discourse communities in which they are living and performing. The present study, using a mixed method approach, attempted to explore the evidences of ...